-
Adversarial attacks pose significant challenges in many machine learning applications, particularly in distributed training and federated learning, where malicious agents seek to corrupt the training process with the goal of compromising the performance and reliability of the final models. In this paper, we address the problem of robust federated learning in the presence of such attacks by formulating the training task as a bi-level optimization problem. We conduct a theoretical analysis of the resilience of consensus-based bi-level optimization (CB2O), an interacting multi-particle metaheuristic optimization method, in adversarial settings. Specifically, we provide a global convergence analysis of CB2O in mean-field law in the presence of malicious agents, demonstrating the robustness of CB2O against a diverse range of attacks. In doing so, we offer insights into how specific hyperparameter choices mitigate adversarial effects. On the practical side, we extend CB2O to the clustered federated learning setting by proposing FedCB2O, a novel interacting multi-particle system, and design a practical algorithm that addresses the demands of real-world applications. Extensive experiments demonstrate the robustness of the FedCB2O algorithm against label-flipping attacks in decentralized clustered federated learning scenarios, showcasing its effectiveness in practical contexts. This article is part of the theme issue ‘Partial differential equations in data science’.
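A short sketch can illustrate the consensus mechanism on which CB2O builds. This is not the paper's FedCB2O algorithm; it is a generic consensus-based optimization (CBO) step in which particles drift toward a Gibbs-weighted consensus point, with the inverse temperature alpha among the hyperparameters governing how strongly the consensus concentrates on well-performing particles. All names and values below are illustrative.

```python
import numpy as np

def cbo_step(X, f, alpha=50.0, lam=1.0, sigma=0.8, dt=0.01, rng=None):
    """One consensus-based optimization (CBO) step for particles X of shape (n, d).

    f     : objective evaluated per particle, returning shape (n,)
    alpha : inverse temperature; larger values concentrate the consensus
            point on the currently best-performing particles
    lam   : drift strength toward the consensus point
    sigma : noise scale, proportional to each particle's distance from consensus
    """
    rng = np.random.default_rng() if rng is None else rng
    vals = f(X)
    w = np.exp(-alpha * (vals - vals.min()))  # shifted Gibbs weights for numerical stability
    consensus = (w[:, None] * X).sum(axis=0) / w.sum()
    diff = X - consensus
    dist = np.linalg.norm(diff, axis=1, keepdims=True)
    return X - lam * diff * dt + sigma * dist * np.sqrt(dt) * rng.standard_normal(X.shape)

# Illustrative usage: minimize a shifted quadratic with 200 particles in R^2
f = lambda X: ((X - np.array([1.0, -2.0])) ** 2).sum(axis=1)
rng = np.random.default_rng(0)
X = 3.0 * rng.standard_normal((200, 2))
for _ in range(500):
    X = cbo_step(X, f, rng=rng)
print(X.mean(axis=0))  # close to the minimizer (1, -2)
```

Heuristically, a larger alpha lets the consensus point discount particles with poor objective values, which is the kind of hyperparameter effect a robustness analysis against malicious particles has to quantify.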
-
Adversarial training is a min-max optimization problem that is designed to construct robust classifiers against adversarial perturbations of data. We study three models of adversarial training in the multiclass agnostic-classifier setting. We prove the existence of Borel measurable robust classifiers in each model and provide a unified perspective of the adversarial training problem, expanding the connections with optimal transport initiated by the authors in their previous work [21]. In addition, we develop new connections between adversarial training in the multiclass setting and total variation regularization. As a corollary of our results, we provide an alternative proof of the existence of Borel measurable solutions to the agnostic adversarial training problem in the binary classification setting.
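As a concrete instance of the min-max problem in the binary setting, the sketch below computes the exact adversarial (worst-case) 0-1 risk of a one-dimensional threshold classifier, where the inner maximization over an eps-ball can be resolved in closed form. The data, eps, and function names are illustrative assumptions, not a construction from the paper.

```python
import numpy as np

def adversarial_risk(threshold, x, y, eps):
    """Worst-case 0-1 risk of the 1-D classifier 1{x > threshold}.

    A point (x_i, y_i) is robustly correct only if every perturbation
    z in [x_i - eps, x_i + eps] lands on the correct side of the threshold.
    """
    robust_correct = np.where(y == 1, x - eps > threshold, x + eps <= threshold)
    return 1.0 - robust_correct.mean()

# Illustrative data: two overlapping classes on the real line
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-1.0, 1.0, 500), rng.normal(1.0, 1.0, 500)])
y = np.concatenate([np.zeros(500, dtype=int), np.ones(500, dtype=int)])

# Outer minimization: pick the threshold with smallest adversarial risk
grid = np.linspace(-3.0, 3.0, 601)
risks = [adversarial_risk(t, x, y, eps=0.3) for t in grid]
print(grid[int(np.argmin(risks))], min(risks))
```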
-
In the standard formulation of the classical denoising problem, one is given a probabilistic model relating a latent variable $$\varTheta \in \varOmega \subset {\mathbb{R}}^{m} \; (m \ge 1)$$ and an observation $$Z \in {\mathbb{R}}^{d}$$ according to $$Z \mid \varTheta \sim p(\cdot \mid \varTheta )$$ and $$\varTheta \sim G^{*}$$, and the goal is to construct a map to recover the latent variable from the observation. The posterior mean, a natural candidate for estimating $$\varTheta $$ from $$Z$$, attains the minimum Bayes risk (under the squared error loss) but at the expense of over-shrinking $$Z$$, and in general may fail to capture the geometric features of the prior distribution $$G^{*}$$ (e.g. low dimensionality, discreteness, sparsity). To rectify these drawbacks, in this paper we take a new perspective on the denoising problem, inspired by optimal transport (OT) theory, and use it to study a different, OT-based, denoiser at the population level. We rigorously prove that, under general assumptions on the model, this OT-based denoiser is mathematically well-defined and unique, and is closely connected to the solution of a Monge OT problem. We then prove that, under appropriate identifiability assumptions on the model, the OT-based denoiser can be recovered solely from the marginal distribution of $$Z$$ and the posterior mean of the model, after solving a linear relaxation problem over a suitable space of couplings that is reminiscent of standard multimarginal OT problems. In particular, due to Tweedie’s formula, when the likelihood model $$\{ p(\cdot \mid \theta ) \}_{\theta \in \varOmega }$$ is an exponential family of distributions, the OT-based denoiser can be recovered solely from the marginal distribution of $$Z$$. In general, our family of OT-like relaxations is of interest in its own right, and for the denoising problem it suggests alternative numerical methods inspired by the rich literature on computational OT.
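In the Gaussian-noise exponential family case, Tweedie’s formula mentioned above expresses the posterior mean through the marginal of $$Z$$ alone: $${\mathbb{E}}[\varTheta \mid Z = z] = z + \sigma ^{2} \, \nabla \log p_{Z}(z)$$. Below is a minimal numerical check of this identity with an illustrative two-point prior; all concrete values are assumptions made for the example.

```python
import numpy as np

sigma = 0.5
theta = np.array([-1.0, 1.0])   # support of the discrete prior G*
prior = np.array([0.3, 0.7])    # prior weights

def marginal(z):
    # p(z) = sum_k prior_k * N(z; theta_k, sigma^2)
    kernels = np.exp(-(z - theta) ** 2 / (2 * sigma ** 2))
    return (prior * kernels).sum() / np.sqrt(2 * np.pi * sigma ** 2)

def posterior_mean(z):
    w = prior * np.exp(-(z - theta) ** 2 / (2 * sigma ** 2))
    return (w * theta).sum() / w.sum()

# Tweedie: E[Theta | Z = z] = z + sigma^2 * d/dz log p(z)
z, h = 0.4, 1e-5
score = (np.log(marginal(z + h)) - np.log(marginal(z - h))) / (2 * h)
print(posterior_mean(z), z + sigma ** 2 * score)  # the two values agree
```

Note how the posterior mean pulls $$z$$ toward the prior support points, which is exactly the shrinkage behavior that motivates the OT-based alternative.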
-
We propose iterative algorithms to solve adversarial training problems in a variety of supervised learning settings of interest. Our algorithms, which can be interpreted as suitable ascent-descent dynamics in Wasserstein spaces, take the form of a system of interacting particles. These interacting particle dynamics are shown to converge toward appropriate mean-field limit equations in suitable large-particle-number regimes. In turn, we prove that, under certain regularity assumptions, these mean-field equations converge, in the large time limit, toward approximate Nash equilibria of the original adversarial learning problems. We present results for non-convex non-concave settings, as well as for non-convex concave ones. Numerical experiments illustrate our results.
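The flavor of these dynamics can be sketched with two finite particle populations: learner particles descend a payoff averaged over the adversary population while adversary particles ascend it, a mean-field-type interaction. This is not the paper's specific scheme; the payoff, step size, and names are illustrative.

```python
import numpy as np

def ascent_descent_step(X, Y, grad_x, grad_y, eta=0.05):
    """Simultaneous ascent-descent for two interacting particle populations.

    Each learner particle x descends the payoff averaged over the adversary
    particles Y; each adversary particle y ascends the payoff averaged over
    the learner particles X.
    """
    gx = np.array([grad_x(x, Y).mean(axis=0) for x in X])  # per-learner averaged gradient
    gy = np.array([grad_y(X, y).mean(axis=0) for y in Y])  # per-adversary averaged gradient
    return X - eta * gx, Y + eta * gy

# Illustrative payoff f(x, y) = x.y + 0.1|x|^2 - 0.1|y|^2 (equilibrium at the origin)
grad_x = lambda x, Y: Y + 0.2 * x  # gradient of f in x, for each adversary particle
grad_y = lambda X, y: X - 0.2 * y  # gradient of f in y, for each learner particle
rng = np.random.default_rng(2)
X, Y = rng.standard_normal((50, 2)), rng.standard_normal((50, 2))
for _ in range(2000):
    X, Y = ascent_descent_step(X, Y, grad_x, grad_y)
print(np.abs(X).mean(), np.abs(Y).mean())  # both populations contract toward the equilibrium
```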
-
We analyze the convergence properties of Fermat distances, a family of density-driven metrics defined on Riemannian manifolds with an associated probability measure. Fermat distances may be defined either on discrete samples from the underlying measure, in which case they are random, or in the continuum setting, where they are induced by geodesics under a density-distorted Riemannian metric. We prove that discrete, sample-based Fermat distances converge to their continuum analogues in small neighborhoods, with a precise rate that depends on the intrinsic dimensionality of the data and on the parameter governing the extent of density weighting in Fermat distances. This is done by leveraging novel geometric and statistical arguments in percolation theory that allow for non-uniform densities and curved domains. Our results are then used to prove that discrete graph Laplacians based on discrete, sample-driven Fermat distances converge to corresponding continuum operators. In particular, we show that the discrete eigenvalues and eigenvectors converge to their continuum analogues at a dimension-dependent rate, which allows us to interpret the efficacy of discrete spectral clustering using Fermat distances in terms of the resulting continuum limit. The perspective afforded by our discrete-to-continuum Fermat distance analysis leads to new clustering algorithms for data and related insights into efficient computations associated with density-driven spectral clustering. Our theoretical analysis is supported by numerical simulations and by experiments on synthetic and real image data.
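A minimal sketch of the discrete, sample-based Fermat distance under the usual graph construction (shortest paths through the sample with edge costs $$|x_i - x_j|^{p}$$) is given below; the helper name and parameter values are illustrative.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import pdist, squareform

def sample_fermat_distances(X, p=3.0):
    """All-pairs sample Fermat distances for points X of shape (n, d).

    The Fermat distance between x_i and x_j is the minimum, over paths
    through the sample, of the sum of Euclidean edge lengths raised to the
    power p >= 1; p controls the strength of the density weighting, and
    p = 1 recovers plain Euclidean distances on the complete graph.
    """
    W = squareform(pdist(X)) ** p  # complete graph with edge weights |x_i - x_j|^p
    return shortest_path(W, method="D", directed=False)

# Illustrative usage: as p grows, paths through dense regions become cheaper
rng = np.random.default_rng(3)
X = np.vstack([rng.normal([0.0, 0.0], 0.3, (100, 2)),
               rng.normal([3.0, 0.0], 0.3, (100, 2))])
D = sample_fermat_distances(X, p=3.0)
print(D.shape, D[0, 150])  # (200, 200) and one cross-cluster distance
```

Distances of this kind can be fed directly into a graph Laplacian or spectral clustering pipeline, which is the discrete object whose continuum limit the analysis above addresses.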